The Example Data for This Course

(put photo of elk here)

The data we will be working with throughout this course consist of GPS tracking data for Ya Ha Tinda elk (Cervus canadensis) from Hebblewhite et al. (2008) and the Ya Ha Tinda Elk Project.

Some elk in this population are migratory (migrating to Banff National Park in the summer), others are resident in the Ya Ha Tinda area year-round, and others have demonstrated “switching” behavior between resident and migratory strategies.

These data are also available on Movebank.

Data Processing in R

Principles of Data Processing

Data that has been processed “smartly” will have the following features:

  • Compartmentalized - e.g., each step in your code/methods uses functions and the most efficient code possible

  • Interactive - e.g., leverage visualization and interactive tools

  • Generalizable - e.g., able to be applied to multiple individuals

  • Replicable - e.g., saving your code and data products regularly and NEVER overwriting the raw data

  • Well-Documented - e.g., commenting your code along the way, keeping files in organized folders, and storing methods in an external document as you go

Follow these guidelines and you will save yourself from many future data processing headaches!

Bringing Data into R

Data can be brought into R many ways:

  • read.csv - a Base R function to read in CSV files by calling the file path of wherever the file is stored

  • load - a Base R function to load in an Rda object (“R data object”)

  • getMovebankData - function from the “move” package (Kranstauber et al 2023) to retrieve Movebank datasets by name

Read a CSV File In

Excel loves to mess up date/time information, so be sure to check that your datetime column is formatted correctly (to include both the date AND time, with hours, minutes, & seconds, if applicable) before reading it into R.

The file path should correspond to wherever you saved your CSV file on your computer.

Note: It is helpful to first set your working directory (setwd), so that you don’t have to type out the entire file path every time.
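For example, a minimal sketch of setting the working directory (the path below is a placeholder; substitute the folder where your project lives on your own computer):

```r
# Placeholder path - replace with your own project folder
setwd("C:/Users/YourName/Documents/elk_project")

# Confirm where R is now looking for files
getwd()

# Relative paths like "./data/Elk_GPS_data.csv" will now resolve inside that folder
```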

elk_gps <- read.csv("./data/Elk_GPS_data.csv")

str(elk_gps)
## 'data.frame':    138433 obs. of  7 variables:
##  $ X                          : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ timestamp                  : chr  "3/25/2003 19:01:00" "3/25/2003 23:01:00" "3/26/2003 1:01:00" "3/26/2003 5:00:00" ...
##  $ location.long              : num  -115 -115 -115 -115 -115 ...
##  $ location.lat               : num  51.7 51.7 51.7 51.7 51.7 ...
##  $ migration.stage            : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ sensor.type                : chr  "gps" "gps" "gps" "gps" ...
##  $ individual.local.identifier: chr  "GP1" "GP1" "GP1" "GP1" ...

Load an Rda Object

Rda objects are a useful way to store data. Essentially, R stores your object(s) as a compressed “.rda” file (R file type). This does not work with raster data, but will work with most other object types.

It’s also good practice to save your intermediate R objects in a “Data” folder in your R project or repository. For example, a great time to save your data would be after processing!

You can save data using the save Base R function and then load it, using the same file path you saved it to.

When saving, don’t forget to add the file name and type at the end of the file path!

save(elk_gps, file="./data/elk_gps.rda")
load("./data/elk_gps.rda")

str(elk_gps)
## 'data.frame':    138433 obs. of  6 variables:
##  $ timestamp                  : chr  "3/25/2003 19:01:00" "3/25/2003 23:01:00" "3/26/2003 1:01:00" "3/26/2003 5:00:00" ...
##  $ location.long              : num  -115 -115 -115 -115 -115 ...
##  $ location.lat               : num  51.7 51.7 51.7 51.7 51.7 ...
##  $ migration.stage            : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ sensor.type                : chr  "gps" "gps" "gps" "gps" ...
##  $ individual.local.identifier: chr  "GP1" "GP1" "GP1" "GP1" ...

Load In Movebank Data

Movebank is an incredibly useful (online) resource and repository for storing and accessing tracking datasets for a variety of species.

Public data can be downloaded either online or through the “move” R package. After installing the package (install.packages), we can load it for use in our R session using the library Base R function.

library(move)

You will need to make a Movebank account first and then log in using the movebankLogin function.

mylogin <- movebankLogin(username = 'YourUsername', password = 'yourpassword')

Now you can use your login information object within the getMovebankData function to access the study you are interested in.

Let’s access one of the Elk Movebank Datasets.

IMPORTANT - you first need to go to the study page on Movebank, log in with your credentials, and under the “Download” tab, click “Download Data” and then agree to the license agreement. You can now download the data on the web to your computer OR use the function below to download the data.

elk_move <- getMovebankData(study = "Ya Ha Tinda elk project, Banff National Park, 2001-2023 (females)", login = mylogin, removeDuplicatedTimestamps=TRUE)
head(elk_move)
##       tag_id sensor_type_id external_temperature gps_dop height_above_ellipsoid
## 1 1200770822            653                   NA      NA                     NA
## 2 1200770822            653                   NA      NA                     NA
## 3 1200770822            653                   NA      NA                     NA
## 4 1200770822            653                   NA      NA                     NA
## 5 1200770822            653                   NA      NA                     NA
## 6 1200770822            653                   NA      NA                     NA
##   location_lat location_long manually_marked_outlier           timestamp
## 1     52.12410     -115.8044                         2001-12-13 07:01:12
## 2     52.11762     -115.8003                         2001-12-13 09:01:07
## 3     52.09611     -115.8281                         2001-12-14 09:01:05
## 4     52.09829     -115.8318                         2001-12-14 11:00:49
## 5     52.09482     -115.8042                         2001-12-14 17:02:19
## 6     52.12493     -115.8037                         2001-12-14 19:01:07
##                 update_ts visible deployment_id    event_id sensor_type
## 1 2024-04-22 14:52:38.396    true    3662883980 15155700828         GPS
## 2 2024-04-22 14:52:38.396    true    3662883980 15155702143         GPS
## 3 2024-04-22 14:52:38.396    true    3662883980 15155694994         GPS
## 4 2024-04-22 14:52:38.396    true    3662883980 15155693463         GPS
## 5 2024-04-22 14:52:38.396    true    3662883980 15155700885         GPS
## 6 2024-04-22 14:52:38.396    true    3662883980 15155700986         GPS
##   tag_local_identifier
## 1                 4049
## 2                 4049
## 3                 4049
## 4                 4049
## 5                 4049
## 6                 4049

The getMovebankData function will result in a “MoveStack” object, which is specially formatted for the “move” package functions.

We can do a quick plot of the elk locations in this dataset using the plot function on our MoveStack object.

plot(elk_move)

We can convert it to a basic R data frame object using the as.data.frame function.

elk_df <- as.data.frame(elk_move)

str(elk_df)
## 'data.frame':    1742248 obs. of  50 variables:
##  $ tag_id                 : num  1.2e+09 1.2e+09 1.2e+09 1.2e+09 1.2e+09 ...
##  $ sensor_type_id         : int  653 653 653 653 653 653 653 653 653 653 ...
##  $ external_temperature   : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ gps_dop                : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ height_above_ellipsoid : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ location_lat           : num  52.1 52.1 52.1 52.1 52.1 ...
##  $ location_long          : num  -116 -116 -116 -116 -116 ...
##  $ manually_marked_outlier: chr  "" "" "" "" ...
##  $ timestamp              : POSIXct, format: "2001-12-13 07:01:12" "2001-12-13 09:01:07" ...
##  $ update_ts              : chr  "2024-04-22 14:52:38.396" "2024-04-22 14:52:38.396" "2024-04-22 14:52:38.396" "2024-04-22 14:52:38.396" ...
##  $ visible                : chr  "true" "true" "true" "true" ...
##  $ deployment_id          : num  3.66e+09 3.66e+09 3.66e+09 3.66e+09 3.66e+09 ...
##  $ event_id               : num  1.52e+10 1.52e+10 1.52e+10 1.52e+10 1.52e+10 ...
##  $ sensor_type            : Factor w/ 1 level "GPS": 1 1 1 1 1 1 1 1 1 1 ...
##  $ tag_local_identifier   : chr  "4049" "4049" "4049" "4049" ...
##  $ location_long.1        : num  -116 -116 -116 -116 -116 ...
##  $ location_lat.1         : num  52.1 52.1 52.1 52.1 52.1 ...
##  $ optional               : logi  TRUE TRUE TRUE TRUE TRUE TRUE ...
##  $ sensor                 : Factor w/ 1 level "GPS": 1 1 1 1 1 1 1 1 1 1 ...
##  $ timestamps             : POSIXct, format: "2001-12-13 07:01:12" "2001-12-13 09:01:07" ...
##  $ trackId                : Factor w/ 206 levels "X4049","BL201",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ birth_hatch_latitude   : logi  NA NA NA NA NA NA ...
##  $ birth_hatch_longitude  : logi  NA NA NA NA NA NA ...
##  $ comments               : logi  NA NA NA NA NA NA ...
##  $ death_comments         : logi  NA NA NA NA NA NA ...
##  $ earliest_date_born     : logi  NA NA NA NA NA NA ...
##  $ exact_date_of_birth    : logi  NA NA NA NA NA NA ...
##  $ group_id               : logi  NA NA NA NA NA NA ...
##  $ individual_id          : num  1.2e+09 1.2e+09 1.2e+09 1.2e+09 1.2e+09 ...
##  $ latest_date_born       : logi  NA NA NA NA NA NA ...
##  $ local_identifier       : chr  "4049" "4049" "4049" "4049" ...
##  $ marker_id              : logi  NA NA NA NA NA NA ...
##  $ mates                  : logi  NA NA NA NA NA NA ...
##  $ mortality_date         : logi  NA NA NA NA NA NA ...
##  $ mortality_latitude     : logi  NA NA NA NA NA NA ...
##  $ mortality_longitude    : logi  NA NA NA NA NA NA ...
##  $ mortality_type         : logi  NA NA NA NA NA NA ...
##  $ nick_name              : logi  NA NA NA NA NA NA ...
##  $ offspring              : logi  NA NA NA NA NA NA ...
##  $ parents                : logi  NA NA NA NA NA NA ...
##  $ ring_id                : logi  NA NA NA NA NA NA ...
##  $ sex                    : chr  "f" "f" "f" "f" ...
##  $ siblings               : logi  NA NA NA NA NA NA ...
##  $ taxon_canonical_name   : chr  "Cervus elaphus" "Cervus elaphus" "Cervus elaphus" "Cervus elaphus" ...
##  $ timestamp_start        : chr  "2001-12-13 07:01:12.000" "2001-12-13 07:01:12.000" "2001-12-13 07:01:12.000" "2001-12-13 07:01:12.000" ...
##  $ timestamp_end          : chr  "2002-11-14 03:00:55.000" "2002-11-14 03:00:55.000" "2002-11-14 03:00:55.000" "2002-11-14 03:00:55.000" ...
##  $ number_of_events       : int  3247 3247 3247 3247 3247 3247 3247 3247 3247 3247 ...
##  $ number_of_deployments  : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ sensor_type_ids        : chr  "gps" "gps" "gps" "gps" ...
##  $ taxon_detail           : chr  "ssp. canadensis" "ssp. canadensis" "ssp. canadensis" "ssp. canadensis" ...

Data Processing Steps

Processing data will always be specific to your data and needs. Sometimes it can be helpful to do some processing and data cleaning outside of R (e.g., within Excel, especially for datetime information).

You may find it useful to diagram or write out your data processing needs BEFORE trying to draft your code.

R is a powerful tool for quick, efficient, and reproducible data processing and cleaning. If there is ever something you don’t know how to do in R, a quick Google search or a look at one of the many online R resources (e.g., R-Bloggers or Stack Overflow) will likely turn up a solution.

Step 1: Re-Name or Drop Columns

The “[]” operator can be used to grab specific columns by their number in a dataframe.

Let’s drop the 4th and 5th columns from our “elk_gps” dataset.

elk_gps <- elk_gps[, -c(4:5)]

str(elk_gps)
## 'data.frame':    138433 obs. of  4 variables:
##  $ timestamp                  : chr  "3/25/2003 19:01:00" "3/25/2003 23:01:00" "3/26/2003 1:01:00" "3/26/2003 5:00:00" ...
##  $ location.long              : num  -115 -115 -115 -115 -115 ...
##  $ location.lat               : num  51.7 51.7 51.7 51.7 51.7 ...
##  $ individual.local.identifier: chr  "GP1" "GP1" "GP1" "GP1" ...

Columns can be renamed using the names Base R function and a vector of the names you want, as text (character format) and in the order you want.

Let’s rename our columns to “datetime”, “lon”, “lat”, and “id”.

Note the structure of your data as well. R has many different data structures, but the main ones you will use are numeric, character (factor is similar but has levels), POSIXct, and spatial.

names(elk_gps) <- c("datetime", "lon", "lat", "id")

str(elk_gps)
## 'data.frame':    138433 obs. of  4 variables:
##  $ datetime: chr  "3/25/2003 19:01:00" "3/25/2003 23:01:00" "3/26/2003 1:01:00" "3/26/2003 5:00:00" ...
##  $ lon     : num  -115 -115 -115 -115 -115 ...
##  $ lat     : num  51.7 51.7 51.7 51.7 51.7 ...
##  $ id      : chr  "GP1" "GP1" "GP1" "GP1" ...

Step 2: Convert DateTime Column to POSIXct Format

R reads datetime data in a particular way. Converting your datetime column to “POSIXct” format will ensure that R treats that column as datetime information, not just character text.

To convert a column to POSIXct format, you can use the as.POSIXct function.

Here, you need to be careful to specify the format of the datetime column exactly as it is, using POSIXct syntax (e.g., “%m” for month, “%d” for day, “%Y” for year, “%H” for hours, “%M” for minutes, and “%S” for seconds).
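As a quick illustration on a made-up datetime string: a format string that matches the text parses correctly, while a mismatched one returns NA.

```r
# Format matches the text -> parses to a POSIXct datetime
as.POSIXct("3/25/2003 19:01:00", format = "%m/%d/%Y %H:%M:%S")
# "2003-03-25 19:01:00" (in your local time zone)

# Format does NOT match the text -> returns NA
as.POSIXct("3/25/2003 19:01:00", format = "%Y-%m-%d %H:%M:%S")
# NA
```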

Let’s check the format of our datetime column:

elk_gps$datetime[1]
## [1] "3/25/2003 19:01:00"

Now we specify this format with POSIXct syntax in the “format” argument of the as.POSIXct function.

elk_gps$datetime <- as.POSIXct(elk_gps$datetime, format="%m/%d/%Y %H:%M:%S")

str(elk_gps)
## 'data.frame':    138433 obs. of  4 variables:
##  $ datetime: POSIXct, format: "2003-03-25 19:01:00" "2003-03-25 23:01:00" ...
##  $ lon     : num  -115 -115 -115 -115 -115 ...
##  $ lat     : num  51.7 51.7 51.7 51.7 51.7 ...
##  $ id      : chr  "GP1" "GP1" "GP1" "GP1" ...

Now that our datetime column is in POSIXct format, we can perform mathematical operations on this column and return the results in units of time.

elk_gps$datetime[2] - elk_gps$datetime[1]
## Time difference of 4 hours

For example, we can use the difftime function to take the difference in time between our first and second observations in the datetime column, specifying the desired units for the output:

difftime(elk_gps$datetime[2] , elk_gps$datetime[1], units = "mins")
## Time difference of 240 mins

Step 3: Check for Missing Data

R stores missing values as “NA” (or sometimes, “NaN”).

You can check for NA values in a vector or column using the is.na function, which handily returns a vector of the same length, with TRUE where there are NA’s and FALSE where there are not.
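For example, on a toy vector:

```r
x <- c(2.5, NA, 7.1, NaN)

# TRUE for both NA and NaN
is.na(x)          # FALSE TRUE FALSE TRUE

# Count and locate the missing values
sum(is.na(x))     # 2
which(is.na(x))   # 2 4
```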

Let’s use the subset Base R function to select only the rows in our elk data where there are no NA’s for datetime, lat, and lon.

elk_gps2 <- subset(elk_gps, is.na(datetime)==FALSE & is.na(lon)==FALSE & is.na(lat)==FALSE)

str(elk_gps2)
## 'data.frame':    138429 obs. of  4 variables:
##  $ datetime: POSIXct, format: "2003-03-25 19:01:00" "2003-03-25 23:01:00" ...
##  $ lon     : num  -115 -115 -115 -115 -115 ...
##  $ lat     : num  51.7 51.7 51.7 51.7 51.7 ...
##  $ id      : chr  "GP1" "GP1" "GP1" "GP1" ...

Step 4: Check for Duplicate Data, Sort By Time

Duplicated data can falsely inflate your dataset with repeated observations and cause conflicts with various functions used for analyses later.
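As a quick illustration of how R flags duplicates on a toy vector:

```r
timestamps <- c("08:00", "09:00", "09:00", "10:00")

# duplicated() flags the 2nd (and later) occurrences of a repeated value
duplicated(timestamps)                 # FALSE FALSE TRUE FALSE

# Keep only the non-duplicated elements with the "!" (NOT) condition
timestamps[!duplicated(timestamps)]    # "08:00" "09:00" "10:00"
```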

We will use the “dplyr” R package for more data organizing and cleaning.

library(dplyr)

We will use the “|>” native R pipe to efficiently chain multiple functions together on an object.
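As a toy illustration of how the pipe works (the object on its left becomes the first argument of the function on its right, so the steps read left-to-right instead of inside-out):

```r
# Without the pipe: nested calls read inside-out
round(sqrt(sum(c(4, 9, 16))), 1)            # 5.4

# With the native pipe: the same steps read left-to-right
c(4, 9, 16) |> sum() |> sqrt() |> round(1)  # 5.4
```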

Before filtering out duplicate datetime data (using the filter function with the “!” condition to keep only the rows that are not duplicated) and sorting our data by datetime (using the arrange function, which sorts in increasing order by default), we first need to group the data by each individual in our dataset (our “id” column) so that the functions are applied to each individual separately.

We then ungroup our data to combine the results among all individuals and use the data.frame function to ensure the resulting object is a dataframe structure.

elk_gps3 <- elk_gps2 |>
  group_by(id) |>
  filter(!duplicated(datetime)) |>
  arrange(datetime) |>
  ungroup() |>
  data.frame()

head(elk_gps3)
##              datetime       lon      lat   id
## 1 2001-12-13 07:01:00 -115.8043 52.12410 4049
## 2 2001-12-13 09:01:00 -115.8003 52.11762 4049
## 3 2001-12-14 09:01:00 -115.8281 52.09611 4049
## 4 2001-12-14 11:00:00 -115.8318 52.09829 4049
## 5 2001-12-14 17:02:00 -115.8042 52.09482 4049
## 6 2001-12-14 19:01:00 -115.8037 52.12493 4049

Step 5: Make Data Spatial

Spatial data in R comes in multiple formats (vectors, e.g. points, lines, and polygons, or rasters). You may find the online resource Spatial Data Science with R and “terra” helpful for further information.

There are multiple packages in R for formatting spatial data but a fan favorite for vector data is the “sf” package.

Raster data can be processed using the “raster” package or more recently, the “terra” package.

library(sf)

If you are working with movement/tracking data, you should have columns with geographic coordinate information for each observation in the dataset, often stored in latitude/longitude format (units: angular decimal degrees).

Let’s convert our elk data to “sf” format, using the st_as_sf function, specifying the columns with our coordinate information (longitude first, then latitude) and the Coordinate Reference System EPSG code (4326 corresponds to WGS 1984, a Geographic Coordinate System for data with coordinates in units of decimal degrees).

Our new sf object is a “POINT” vector type, with each location/observation in the data having a corresponding POINT geometry. The sf package has many amazing functions to manipulate spatial data, including spatial operations and conversion to different vector types.

elk_sf <- elk_gps3 |>
  st_as_sf(coords = c("lon","lat"), crs=4326)

elk_sf
## Simple feature collection with 138421 features and 2 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -116.4028 ymin: 51.38134 xmax: -115.3461 ymax: 52.1541
## Geodetic CRS:  WGS 84
## First 10 features:
##               datetime   id                   geometry
## 1  2001-12-13 07:01:00 4049  POINT (-115.8043 52.1241)
## 2  2001-12-13 09:01:00 4049 POINT (-115.8003 52.11762)
## 3  2001-12-14 09:01:00 4049 POINT (-115.8281 52.09611)
## 4  2001-12-14 11:00:00 4049 POINT (-115.8318 52.09829)
## 5  2001-12-14 17:02:00 4049 POINT (-115.8042 52.09482)
## 6  2001-12-14 19:01:00 4049 POINT (-115.8037 52.12493)
## 7  2001-12-14 21:00:00 4049 POINT (-115.7985 52.12587)
## 8  2001-12-15 01:01:00 4049 POINT (-115.7973 52.10518)
## 9  2001-12-15 03:01:00 4049 POINT (-115.7875 52.08844)
## 10 2001-12-15 05:01:00 4049 POINT (-115.8239 52.09922)

Visualizations

Visualizing data is an important step in the data processing and analysis stages.

Learning how to visualize your data correctly can help you catch errors in your code, outliers in your data, and interesting patterns that will inform your analysis choices.

Visualization can be done with Base R plotting or the “ggplot2” R package, which is excellent for creating complex plots in one line of code.

library(ggplot2)

Visualize Tracks

Base R

We can use base R to specify a “fancy” plot showing each of our tracks in multiple dimensions, including latitude/longitude, latitude versus time, and longitude versus time.

Note that using a geographic coordinate system (GCS) with units in angular degrees (lat/lon) can be useful for mapping and observing patterns in the data but that direct, measurable comparisons can only be made with a projected coordinate system (PCS), where the units are in measurable units (e.g., meters). We will demonstrate how to work with projected coordinates in the next lab.

Note: You may find this online resource on Plotting with Base R useful if you are unfamiliar with the plot function and its various arguments.

Let’s make a base R plot of the track for the individual “GP1”.

GP1 <- subset(elk_gps3, id == "GP1")

par(mar = c(0,4,0,0), oma = c(4,0,5,2), xpd=NA)
layout(rbind(c(1,2), c(1,3)))
plot(GP1$lon, GP1$lat, asp = 1, type="o", ylab="Latitude", xlab="Longitude")
plot(GP1$datetime, GP1$lon, type="o", xaxt="n", ylab="Longitude", xlab="")
plot(GP1$datetime, GP1$lat, type="o", ylab="Latitude", xlab="Datetime")
title(paste("ID", GP1$id[1]), outer = TRUE)

We can also write our own little function, using the function function, to apply this plot to all individuals at once.

For more information/help on writing functions in R, check out this helpful online resource, Writing Functions in R.

plotTrack_latlon <- function(dataframe){
  par(mar = c(0,4,0,0), oma = c(4,0,5,2), xpd=NA)
  layout(rbind(c(1,2), c(1,3)))
  plot(dataframe$lon, dataframe$lat, asp = 1, type="o", ylab="Latitude", xlab="Longitude")
  plot(dataframe$datetime, dataframe$lon, type="o", xaxt="n", ylab="Longitude", xlab="")
  plot(dataframe$datetime, dataframe$lat, type="o", ylab="Latitude", xlab="Datetime")
  title(paste("ID", dataframe$id[1]), outer = TRUE)
}

We can use the “plyr” package with the d_ply function to apply our new function grouped by a variable of interest, here each individual. This function will simply return the output of the function used (see also ddply, which we will use later to return a dataframe based on a function’s output).

Note: plyr and dplyr do not “play nice” with each other. You need to load the dplyr package AFTER the plyr package to avoid conflicts and function masking (which happens when two loaded packages have functions of the same name). We can use the detach function to unload dplyr, then use the library function to load plyr and dplyr, consecutively. If you see that a particular function is “masked”, you can prefix the function name with its package name and “::” to ensure the version you want is used (e.g., dplyr::select()).

detach("package:dplyr", unload=TRUE)

library(plyr)

library(dplyr)
elk_gps3 |> d_ply("id", plotTrack_latlon)

ggplot

The ggplot function takes arguments for the data object to be plotted, the x and y axis variables to be plotted (within the aes() function), a variety of additional aesthetic arguments (e.g., color or linewidth), and additive functions that define the plot type.

library(ggplot2)

For more help with plotting with ggplot, check out this helpful online resource, ggplot2 with the tidyverse.

Let’s make a plot for the elk individual “GP1”, using longitude for the x-axis, latitude for the y-axis, and using points and a connecting “path” between points to plot the track.

ggplot(data = GP1, aes(x = lon, y = lat)) +
  geom_point()+
  geom_path(size = 0.5, color = "darkgrey") +
  theme_classic()

Now let’s visualize all individuals at once, using the additional facet_wrap function to facet or group our plots by each elk id. The “scale = ‘free’” argument allows each plot to have its own intuitive x and y axis limits (“scale = ‘fixed’” does the opposite).

ggplot(data = elk_gps3, aes(x = lon, y = lat)) +
  geom_path(size = 0.5, color = "darkgrey") +
  geom_point() +
  theme_classic() +
  facet_wrap(~id, scale="free", ncol=3)

We can also visualize latitude (or longitude) versus datetime.

This can be especially helpful for identifying interesting behavioral patterns in animal movement (e.g., residency vs transiting), which can inform your choice of analysis method later on.

ggplot(data = elk_gps3, aes(x = datetime, y = lat)) +
  geom_path(size = 0.5) +
  xlab("DateTime") + ylab("Latitude") +
  theme_classic() +
  facet_wrap(~id, scale="free", ncol = 3)

Spatial Plots and Mapping

Base R plots

sf objects can be plotted by their attributes using the plot Base R function.

For example, we can plot the points (or “geometry”, which are points) for the “GP1” elk individual, after subsetting its data from the larger “elk_sf” object we created above.

GP1 <- subset(elk_sf, id == "GP1")

plot(GP1$geometry, pch = 19)

Without a background map, this plot is not very informative!

ggplot and ggmap

The “ggmap” package works helpfully with ggplot functions and sf data to add open-source basemaps. More info on the package can be found on the ggmap Github Repo Page.

library(ggmap)

Importantly, you need internet access to download the open-source base map “tiles”.

You also FIRST need to register for a free API key with Stadia Maps at their API Signup Page (see ?register_stadiamaps).

After you complete your registration, go to your client dashboard and create a new “property” (e.g., “Nicki’s API”). You can now create a new API key (save this somewhere on your computer so that you can find it if needed). You can also find instructions under the Stadia API Documentation Page.

key <- "e644fe03-1f7b-4b05-87a8-c65335eb4625"

register_stadiamaps(key, write = FALSE)

Now, before creating our map, we need to define the spatial extent for the basemap, as a “box” defined by two corner points (left/bottom and right/top).

GP1_bbox <- st_bbox(GP1)

names(GP1_bbox) <- c("left","bottom","right","top")

GP1_bbox
##       left     bottom      right        top 
## -115.49872   51.66008 -115.46927   51.68975
We might want to add a buffer around this bbox, so the basemap has a bigger spatial extent than our points. Here we manually define a slightly larger box (in decimal degrees) around the bounding box:

GP1_box <- c(left = -115.6, bottom = 51.5, right = -115.4, top = 51.75)

Next, we use the get_map function on our new bbox to grab the basemap, specifying source “stadia” to use the open source base maps. Note that there are different maptypes available, which will change the visual presentation of the basemap (see ?get_map, we are using “stamen_terrain”).

basemap <- get_map(GP1_box,
               source = "stadia",
               maptype  = "stamen_terrain")

Annoyingly, converting to an sf object removes the coordinate columns from the data (they are stored inside a geometry column instead). We can extract each of the lat/long columns using the st_coordinates function on our sf object (returned as a matrix, with longitude first, then latitude).

GP1$lon <- st_coordinates(GP1)[,1]
GP1$lat <- st_coordinates(GP1)[,2]
ggmap(basemap, extent = "normal") +
  geom_point(data = GP1, color="red") +
  theme_classic()

ggplot and ggspatial

We can use the “ggspatial” package to add open map tiles to the background of our map.

Note that this can be memory-intensive if you are plotting a lot of data at a high resolution …

library(ggspatial)

The “ggspatial” R package is great for maps using Open Street Map (OSM) base map tiles (open source).

The annotation_map_tile function allows you to specify a background map type (“type=”) and zoom level (“zoom=”, where a higher zoom gives higher resolution but may take longer to render). You can run rosm::osm.types() to see all the different map tile types available (we will use the “osm” one).

You can also add some nice map features, such as a scale bar (function annotation_scale()) and a north arrow (function annotation_north_arrow(), with arguments for specifying height, width, and padding dimensions). You can also specify nicer axis labels and breaks using the scale_y_continuous and scale_x_continuous functions, specifying the axis breaks you want.

You can then add on your other, regular ggplot functions, such as geom_sf and theme options.

box <- st_bbox(c(xmin = -115.6, xmax = -115.4, ymin = 51.5, ymax = 51.75), crs = st_crs(4326))

ggplot() +
  annotation_map_tile(type = 'osm', zoom = 12) +
  annotation_scale()+
  annotation_north_arrow(height=unit(0.5,"cm"), width=unit(0.5,"cm"), pad_y = unit(1,"cm"))+
  shadow_spatial(box)+
  ylab("Latitude") + xlab("Longitude")+
  scale_y_continuous(breaks= c(51.5, 51.6, 51.7, 51.75))+
  scale_x_continuous(breaks= c(-115.6, -115.5, -115.4))+
  geom_sf(data=GP1,aes(), color="orange", size=2)+
  theme_classic()

Export Static Map

If you want to export your map, you can do so by sandwiching your map code between a graphics device function for the desired file type (e.g., jpeg or png) and the dev.off function.

jpeg(file="./GP1_Map.jpg", units="in", width=4, height=7,res=300)
ggplot() +
  annotation_map_tile(type = 'osm', zoom = 12) +
  annotation_scale()+
  annotation_north_arrow(height=unit(0.5,"cm"), width=unit(0.5,"cm"), pad_y = unit(1,"cm"))+
  shadow_spatial(box)+
  ylab("Latitude") + xlab("Longitude")+
  scale_y_continuous(breaks= c(51.5, 51.6, 51.7, 51.75))+
  scale_x_continuous(breaks= c(-115.6, -115.5, -115.4))+
  geom_sf(data=GP1,aes(), color="orange", size=2)+
  theme_classic()
dev.off()

mapview

Interactive mapping in R with the “mapview” R package is a useful way to visualize and engage with spatial data.

library(mapview)

Let’s create spatial tracks of all of our elk data, using our “elk_sf” object.

We use the summarize function and the st_cast function with the group_by function from the dplyr package to first create individual elk tracks as LINESTRING geometries.

elk_tracks <- elk_sf |> 
  group_by(id) |> 
  summarize(do_union=FALSE) |> 
  st_cast("LINESTRING")

Now we can use the mapview function to plot our tracks, specifying to color the tracks by different ids with the “zcol” argument. Note that the mapview function can plot any spatial data and has a variety of additional controls/arguments available (see Advanced Mapview Controls for more options and examples).

mapview(elk_tracks, zcol="id")

Save Your Processed Data

After processing your data, it is always useful to save it as an intermediate data object. Rda format is perfect for this, as it is an R-specific file type that will load your saved object(s) directly back into your Global Environment for use.

elk_processed <- elk_gps3

save(elk_processed, file="./data/elk_processed.rda")